Abstract

We propose a purely combinatorial quadratic-time algorithm that, for any n-vertex $P_k$-free
tournament T, where $P_k$ is a directed path of length k, finds in T a transitive subset of order
$n^{c/(k\log(k)^2)}$. As a byproduct of our method, we obtain a subcubic
$O(n^{1-c/(k\log(k)^2)})$-approximation algorithm for the optimal acyclic coloring problem on
$P_k$-free tournaments. Our results are tight up to the $\log(k)$-factor in the following sense:
there exist infinite families of $P_k$-free tournaments with largest transitive subsets of order at
most $n^{c\log(k)/k}$. As a corollary, we give tight asymptotic results regarding the so-called
Erdős-Hajnal coefficients of directed paths. These are some of the first asymptotic results on these
coefficients for infinite families of prime graphs.



Keywords: $P_k$-free tournaments, acyclic colorings, transitive subsets, the Erdős-Hajnal conjecture

1 Introduction 

The graph coloring problem is of fundamental importance in computer science. In the undirected setting
the task is to color all the vertices of the graph using as few colors as possible and in such a way that
every color class induces an independent set. The chromatic number $\chi(G)$ of the undirected graph G
is the minimum number of colors that can be used under these constraints. In the directed setting
(EH) the coloring needs to be done in such a way that every color class induces an acyclic digraph.
Such a coloring is called an acyclic coloring. In particular, when the graph to color is a tournament,
each color class is a transitive subset (transitive subsets correspond in the tournament setting
to the independent sets in the undirected one). The number of colors in the optimal acyclic coloring
of a digraph D is called the dichromatic number $\chi_a(D)$ of the digraph D. Digraph colorings arise
in several applications and are used extensively in kernel theory and tournament theory, and thus they
have attracted the attention of many researchers.

The coloring problem is NP-hard, and an even stronger statement is true: for every $\epsilon > 0$, finding
an $n^{1-\epsilon}$-approximation of the optimal coloring is NP-hard. Due to this hardness, many research
efforts have focused on finding good-quality colorings for several special classes of graphs. For instance,
the scenario in which the graph G under consideration is k-colorable for some k > 0 was analyzed. The best
known coloring algorithms for that case give an $n^{1-c/k}$-approximation, where n = |G|. The best constants
c are obtained with the use of semidefinite programming asm i2). Another important special class of
graphs to consider in the coloring context are graphs defined by forbidden patterns. These appear
in many places in graph theory. For instance, every directed graph with a topological ordering of vertices
can be equivalently described as not having directed cycles, and every transitive tournament as
not having directed triangles. A finite graph is planar if and only if it does not contain $K_5$ (the
complete graph on five vertices) or $K_{3,3}$ (the complete bipartite graph on six vertices with two
equal-size color classes) as a minor. One of the deepest results in graph theory, the Robertson-Seymour
theorem ([H]), states that every family of graphs (not necessarily planar graphs) that is closed
under minors can be defined by a finite set of forbidden minors. These classes include: forests,
pseudoforests, linear forests (disjoint unions of path graphs), planar and outerplanar graphs, apex
graphs, toroidal graphs, graphs that can be embedded on a fixed two-dimensional manifold, graphs with
bounded treewidth, pathwidth or branchwidth and many more. Note that not having a
certain graph as a minor is a much more restrictive assumption than not having a certain graph H as
an induced subgraph. Other examples include classes of graphs that can be colored with significantly
fewer than $\Omega(n/\log(n))$ colors. For instance, the k-colorable graphs mentioned before do not have as
induced subgraphs those graphs H whose largest independent sets are of size smaller than $|H|/k$. Thus all
those classes can be described as not having some forbidden structures (either induced subgraphs in the
undirected scenario or subdigraphs in the directed setting).

One of these classes of graphs is of particular interest: the $P_k$-free graphs, where $P_k$ is
an undirected path on k vertices. Not much is known about structural properties of $P_k$-free graphs for
k > 5. In particular, it is an open question whether finding the largest independent set is NP-hard
if the class is defined by a forbidden path $P_k$ and k > 5. Coloring $P_k$-free graphs for $k \ge 5$ is known
to be NP-hard. Similarly, no nontrivial approximation algorithms for coloring $P_k$-free graphs are
known for k > 5. Thus the question whether there exists an $n^{1-c/k}$-approximation algorithm (as is
the case for k-colorable graphs) is open. The completely analogous problem can be considered in the
directed setting. In other words, one can ask for an optimal acyclic coloring of $P_k$-free tournaments,
where this time $P_k$ stands for the directed path tournament, i.e. a tournament with an ordering
of vertices $(v_1, ..., v_k)$ under which the backward edges are exactly of the form $(v_{i+1}, v_i)$. As in
the undirected case, no structural theorem for $P_k$-free directed graphs is known. In particular,
the question whether the acyclic coloring problem is NP-hard for this class of graphs is open. It is
striking, though, that an $O(n^{1-c/(k\log(k)^2)})$-approximation algorithm exists, and this is one of our
main results in this paper. In fact we show a stronger result. We give an algorithm that constructs in
a $P_k$-free tournament a transitive set of order $n^{c/(k\log(k)^2)}$. We show that our results are tight up
to the $\log(k)$-factor in the following sense: there exist infinite families of $P_k$-free tournaments with
largest transitive subsets of order at most $n^{c\log(k)/k}$. As a corollary, we give tight asymptotic results
regarding the so-called Erdős-Hajnal coefficients for directed paths. The coefficients come from the
celebrated Erdős-Hajnal conjecture, one of the most fundamental unsolved problems in modern
Ramsey graph theory. Our algorithm for finding big transitive subsets is quadratic in the number of
vertices n and thus optimal (since the input, being a tournament, is of size $O(n^2)$), and it is easy to
implement. It leads straightforwardly to the subcubic coloring algorithm.

2 Related work 

Let us briefly discuss some known results regarding $P_k$-free graphs. The graph coloring problem is known
to be solvable in polynomial time for $P_k$-free graphs, where $k \le 4$ (0). We already mentioned
that it was proven to be NP-hard for $k \ge 5$ ([10]). A related problem, whether a given $P_t$-free graph is
k-colorable (and finding the coloring if a k-coloring exists), was considered in several papers. In [15]
it was proven that the 3-colorability question for $P_5$-free graphs can be answered in polynomial
time. In fact, the 3-coloring question can be answered in polynomial time for the more general class of
$P_6$-free graphs ([13]). A polynomial algorithm answering the question whether a $P_5$-free graph can be
k-colored (and constructing the coloring if this is the case) for arbitrary k > 0 was given in [8]. Very
recently a polynomial algorithm for constructing a maximum independent set in $P_5$-free graphs was
proposed (ED-). No nontrivial approximation algorithms for the coloring problem on $P_k$-free graphs
for general k have been proposed.

In the directed setting it was recently proven ([2]) that for every k > 0 all $P_k$-free tournaments
have polynomial-size transitive subsets, i.e. transitive subsets of size $\Omega(n^{\epsilon})$ for some
$\epsilon > 0$. The coefficients $\epsilon$ were, however, obtained with the use of the regularity lemma, an
inherent ingredient of the entire approach; thus the applied methods did not lead to any practically
interesting algorithmic results. For paths, and in fact all prime tournaments, those coefficients were
proven to be of order at most $\frac{\log(k)}{k}$ ([6]) in the worst-case scenarios. That led to a substantial
gap between the best known upper and lower bounds (the latter being only inversely proportional to the
tower function from the regularity lemma). We practically get rid of that gap in this paper.

Other results regarding (pseudo)transitive subtournaments of polynomial sizes in P-free directed 
graphs can be found in: m , m, rh ], m , m, m , m, m and [23]. 
3 Main results


Before formally stating our results, we introduce a few important definitions used throughout
this article.

All graphs in this paper are finite and simple. Let G be a graph. The vertex set of G is denoted
by V(G), and the edge set by E(G). We write |G| to mean |V(G)|. We refer to |G| as the order of G.
A clique in an undirected graph G is a subset of V(G) all of whose elements are pairwise adjacent,
and an independent set in G is a subset of V(G) all of whose elements are pairwise non-adjacent.
We say that a graph G is H-free if it does not have H as an induced subgraph. A tournament is a
directed graph T, where for every two vertices u, v exactly one of (u,v), (v,u) is an edge of T (that
is, a directed edge). If $(u,v) \in E(T)$, we say that u is adjacent to v, and that v is adjacent from u.
Equivalently, v is an outneighbor of u and u is an inneighbor of v. For a given $X \subseteq V(T)$ we denote
by T|X the subtournament of T induced by the vertex set X. For a graph G and a subset $V \subseteq V(G)$
we denote by $G \setminus V$ the graph obtained from G by deleting V and all edges of G that are adjacent
to a vertex $v \in V$ (in the undirected setting) or adjacent to or from a vertex $v \in V$ (in the directed
setting). A tournament is transitive if it contains no directed cycle (equivalently, no directed cycle of
length three). A set is transitive if it induces a transitive subtournament. A homogeneous set in a
graph G is a subset $V \subseteq V(G)$ such that if a vertex $v \in V(G) \setminus V$ is adjacent to a vertex of V then
it is adjacent to all the vertices of V. A graph G is prime if all its homogeneous sets other than
V(G) are singletons. For two disjoint subsets of the vertices $X, Y \subseteq V(G)$ of a graph G we say that
X is complete to Y if every vertex of X is adjacent to every vertex of Y. (The last three definitions are
valid in both the undirected and the directed setting.)
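
The definitions of transitive and homogeneous sets translate directly into code. The following Python sketch is illustrative only: it assumes a tournament is stored as a dictionary `adj` mapping each vertex to the set of its outneighbors, and the helper names `is_transitive` and `is_homogeneous` are introduced here for illustration and do not come from the paper.

```python
from itertools import combinations


def is_transitive(adj, subset):
    """True if `subset` induces a tournament without a directed triangle."""
    for a, b, c in combinations(list(subset), 3):
        # A 3-vertex tournament is a directed cycle iff every vertex
        # has exactly one outneighbor among the other two.
        outs = [sum(1 for y in (a, b, c) if y in adj[x]) for x in (a, b, c)]
        if outs == [1, 1, 1]:
            return False
    return True


def is_homogeneous(adj, subset):
    """True if every outside vertex is adjacent to all of `subset` or to none of it."""
    s = set(subset)
    for v in adj:
        if v in s:
            continue
        hits = sum(1 for x in s if x in adj[v])
        if 0 < hits < len(s):
            return False
    return True


# Example (an assumption for illustration): cyclic triangle 0 -> 1 -> 2 -> 0,
# all of whose vertices beat vertex 3.
adj = {0: {1, 3}, 1: {2, 3}, 2: {0, 3}, 3: set()}
print(is_transitive(adj, {0, 1, 2}), is_homogeneous(adj, {0, 1, 2}))
```

In this example the triangle {0, 1, 2} is not transitive but is a homogeneous set, so the 4-vertex tournament is not prime.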

A directed path $P_k$ (or simply a path $P_k$, if it is clear from the context that the graph under
consideration is a tournament) is a tournament with vertex set $V(P_k) = \{v_1, ..., v_k\}$ and an ordering
of vertices $(v_1, ..., v_k)$ under which the backward edges are exactly of the form $(v_{i+1}, v_i)$ for
$i = 1, ..., k-1$. We call this ordering a path ordering. If $(v_1, ..., v_k)$ is a path ordering of $P_k$, then
we call the ordering $(v_1, v_3, v_2, v_5, v_4, ...)$ a matching ordering, since under this ordering the graph of
backward edges is a matching. Let $B_{P_k} = (E_1, ..., E_{k/2})$ be the sequence of backward edges of this
ordering, where the backward edges are ordered in $B_{P_k}$ according to the location of their left ends in the
matching ordering of $P_k$. For the ith backward edge $E_i$ we denote by left(i) the location in the matching
ordering of $P_k$ of the left end of $E_i$ and by right(i) the location in the matching ordering of $P_k$ of the
right end of $E_i$ ($left(i), right(i) \in \{1, ..., k\}$). Notice that if $k \neq 4$ then a directed path $P_k$ is prime.
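
To make the path ordering and the matching ordering concrete, here is a small Python sketch. It assumes that k is even and that the matching ordering is $(v_1, v_3, v_2, v_5, v_4, ..., v_k)$; the function names are hypothetical helpers introduced for this illustration only.

```python
def directed_path(k):
    """Outneighbor sets of P_k on vertices 1..k (path ordering 1, 2, ..., k)."""
    adj = {v: set() for v in range(1, k + 1)}
    for i in range(1, k + 1):
        for j in range(i + 1, k + 1):
            if j == i + 1:
                adj[j].add(i)   # backward edge (v_{i+1}, v_i)
            else:
                adj[i].add(j)   # every other pair points forward
    return adj


def matching_ordering(k):
    """Assumed matching ordering (v_1, v_3, v_2, v_5, v_4, ..., v_k), k even."""
    order = [1]
    for i in range(2, k, 2):
        order += [i + 1, i]
    order.append(k)
    return order


def backward_edges(adj, order):
    """Backward edges as (left(i), right(i)) pairs of 1-based positions, sorted by left end."""
    pos = {v: p for p, v in enumerate(order, start=1)}
    return sorted((pos[w], pos[v]) for v in order for w in adj[v] if pos[w] < pos[v])


print(backward_edges(directed_path(6), matching_ordering(6)))
```

For k = 6 this prints `[(1, 3), (2, 5), (4, 6)]`: three pairwise disjoint backward edges covering all six positions, whereas the path ordering itself has five backward edges that share endpoints.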

We are ready to state our results. Our main result is as follows.

Theorem 3.1. There exists a universal constant c > 0 such that for any k > 0 there is an algorithm finding a
transitive set of order $n^{c/(k\log(k)^2)}$ in the $P_k$-free n-vertex tournament in $O(n^2)$ time.

As a simple corollary we obtain:

Theorem 3.2. There exists a universal constant c > 0 such that for any k > 0 there is an algorithm constructing
an acyclic coloring of the $P_k$-free n-vertex tournament using only $n^{1-c/(k\log(k)^2)}$ colors. Furthermore, the
algorithm has running time $O(n^{3-c/(k\log(k)^2)})$.

This result immediately implies the following: 



Theorem 3.3. There exists a universal constant c > 0 such that the dichromatic number of any $P_k$-free n-vertex
tournament T satisfies:

$\chi_a(T) \le n^{1-c/(k\log(k)^2)}$.

The coloring algorithm also serves as an $O(n^{1-c/(k\log(k)^2)})$-approximation algorithm for the optimal
acyclic coloring of a $P_k$-free n-vertex tournament.

Let us switch now to the conjecture of Erdős and Hajnal. The conjecture ([7]) says that:

Conjecture 3.1. For every undirected graph H there exists a constant $\epsilon(H) > 0$ such that the following
holds: every H-free graph G contains a clique or a stable set of size at least $|G|^{\epsilon(H)}$.

In its equivalent directed version ([]]) undirected graphs are replaced by tournaments and cliques/stable
sets by transitive subtournaments:

Conjecture 3.2. For every tournament H there exists a constant $\epsilon(H) > 0$ such that the following holds:
every H-free tournament T contains a transitive subtournament of order at least $|T|^{\epsilon(H)}$.

The coefficient $\epsilon(H)$ from the statement of the conjecture is called the Erdős-Hajnal coefficient.
The conjecture has so far been proven only for some very special forbidden patterns. Those of them
that are prime are particularly important, since if the conjecture is true for all prime graphs then
it is true in general (p]). There are no prime undirected graphs of order at least six for which the
conjecture is known to be true, and for a long time that was also the case in the directed setting.
Very recently an infinite family of prime tournaments satisfying the conjecture was constructed ([4]).
Among them were the directed paths $P_k$. The proof of the conjecture for them provided only purely
theoretical guarantees, since all lower bounds for $\epsilon(H)$ were obtained via the regularity lemma. Our
algorithm gives lower bounds on the Erdős-Hajnal coefficient that are very close to the best upper
bounds, since we have the following ([6]):


Theorem 3.4. There exists a constant c > 0 such that every prime tournament H satisfies:

$\epsilon(H) \le \frac{c\log(|H|)}{|H|}$.

Combining this result with the lower bounds produced by our algorithm, we obtain the following
result regarding the asymptotic behaviour of the Erdős-Hajnal coefficients of directed paths $P_k$:


Theorem 3.5. The Erdős-Hajnal coefficient of the directed path $P_k$ satisfies:

$\epsilon(P_k) = \frac{1}{k^{1+o(1)}}$.
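
The asymptotic form follows from a short sandwich argument. The sketch below spells it out under the assumption that $c_1$ and $c_2$ (names introduced here for illustration) denote the constants from the main algorithmic result and from the upper bound for prime tournaments respectively, and that $k \neq 4$, so that $P_k$ is prime:

```latex
\frac{c_1}{k\log(k)^2} \;\le\; \epsilon(P_k) \;\le\; \frac{c_2\log(k)}{k},
\qquad
\log(k)^2 = k^{o(1)}, \quad \log(k) = k^{o(1)}
\qquad\Longrightarrow\qquad
\epsilon(P_k) = \frac{1}{k^{1+o(1)}} .
```

Both bounds differ from $1/k$ only by a polylogarithmic factor, which is absorbed into the $o(1)$ term of the exponent.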

So far such precise asymptotics were known for only one infinite family of prime tournaments, the
so-called stars (see [1] for a definition of a star). Our results make the family of directed paths the
second class of prime tournaments for which these asymptotics are known.

In the next section we present the algorithms mentioned in Theorem 3.1 and Theorem 3.2. In the
following section we prove that both algorithms have the properties described in these theorems. In
the last section we summarize our results and briefly discuss possible extensions of the presented
techniques.

4 The Algorithm

All logarithms considered from now on are of base two. Without loss of generality we will assume
that $k = 2^w$ for some w > 0. First we will present an algorithm FindTrans that finds in the $P_k$-free
n-vertex tournament a transitive subset of size $n^{c/(k\log(k)^2)}$ for some universal constant c > 0 (an exact
value of this constant may be calculated, but we will not focus on it in the paper). The acyclic
coloring algorithm AcyclicColoring is a simple application of the former. It runs FindTrans to find
the first color class, removes it from the tournament, runs FindTrans on the remaining tournament
to find the second color class, etc.

4.1 Algorithm FindTrans 

Before giving a description of the algorithm FindTrans, we need to introduce a few more definitions.
For a tournament T and two disjoint nonempty subsets $X, Y \subseteq V(T)$ we denote
$d(X,Y) = \frac{e(X,Y)}{|X||Y|}$, where e(X,Y) is the number of directed edges of T going from X to Y.
The expression d(X,Y) encodes the directed density of edges from X to Y.
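
The directed density can be computed directly from the definition. The sketch below is a minimal illustration, assuming the tournament is given as a dictionary `adj` of outneighbor sets; the function name `density` is chosen here and is not part of the paper.

```python
def density(adj, X, Y):
    """Directed density d(X, Y) = e(X, Y) / (|X| |Y|) for disjoint nonempty X, Y."""
    X, Y = set(X), set(Y)
    if not X or not Y or X & Y:
        raise ValueError("X and Y must be disjoint and nonempty")
    e = sum(1 for x in X for y in Y if y in adj[x])  # edges going from X to Y
    return e / (len(X) * len(Y))


# Transitive tournament on 0..3 (an illustrative assumption): i -> j whenever i < j.
adj = {i: {j for j in range(4) if j > i} for i in range(4)}
print(density(adj, {0, 1}, {2, 3}), density(adj, {2, 3}, {0, 1}))
```

Since every pair of vertices in a tournament is joined by exactly one directed edge, d(X,Y) + d(Y,X) = 1 for all disjoint nonempty X, Y.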


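Input: k > 1 and a-sequence $\theta = (A_1, ..., A_k)$ of length k;

Output: a-sequence $\theta_s$;

begin

    let $\lambda$ = […] and $\lambda_k = 4\lambda k^2$;

    let $C_{i,j} = \{v \in A_i : |N_v^{\theta}(j)| < |A_j|(1 - 2k\lambda_k)\}$ for $i, j \in \{1, ..., k\}$, $i \neq j$;
    update: $A_i \leftarrow A_i \setminus \bigcup_{j \neq i} C_{i,j}$ for $i = 1, ..., k$;
    output $(A_1, ..., A_k)$;

end

Algorithm 1: Algorithm MakeSmooth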

We say that a sequence $(A_1, ..., A_l)$ of pairwise disjoint subsets of V(T) is a $(c, \lambda)$-a-sequence of
length l if the following holds:

• $|A_i| \ge c|T|$ for $i = 1, ..., l$ and

• $d(A_i, A_j) \ge 1 - \lambda$ for $1 \le i < j \le l$.

If the parameters $c, \lambda$ of the $(c, \lambda)$-a-sequence are not important, we simply refer to it as an
a-sequence.

We say that a $(c, \lambda)$-a-sequence $(A_1, ..., A_l)$ of length l is smooth if the following strengthening of
the second condition from the definition above holds:

• $d(\{x\}, A_j) \ge 1 - \lambda$ for $x \in A_i$, $1 \le i < j \le l$ and,

• $d(A_i, \{y\}) \ge 1 - \lambda$ for $y \in A_j$, $1 \le i < j \le l$.

Given an a-sequence $\theta = (A_1, ..., A_l)$, a vertex $v \in A_i$ and $j \neq i$ we denote by $N_v^{\theta}(j)$:

• the set of all outneighbors of v from $A_j$ if j > i and,
• the set of all inneighbors of v from $A_j$ if j < i.

For an a-sequence $\theta = (A_1, ..., A_l)$ we denote: $V(\theta) = A_1 \cup ... \cup A_l$. Let
$\theta_1 = (A_1, ..., A_l)$ and $\theta_2 = (B_1, ..., B_r)$ be two disjoint a-sequences. We denote by
$\theta_1 \otimes \theta_2$ the a-sequence $(A_1, ..., A_l, B_1, ..., B_r)$.
For a set A and $m \le |A|$ we denote by tr(A, m) the truncated version of A obtained by arbitrarily taking
m of its elements. For an a-sequence $\theta = (A_1, ..., A_l)$ we denote: $tr(\theta, m) = (tr(A_1, m), ..., tr(A_l, m))$.
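
The pruning step of MakeSmooth can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact procedure: the tournament is a dictionary `adj` of outneighbor sets, the a-sequence is a list of sets indexed from 0, and `lam_k` is passed in as a parameter because its concrete value is not fixed here. The names `neighbors_in` and `make_smooth` are hypothetical.

```python
def neighbors_in(adj, v, i, j, A):
    """N_v(j) for v in A[i]: outneighbors of v in A[j] if j > i, inneighbors if j < i."""
    if j > i:
        return {w for w in A[j] if w in adj[v]}
    return {w for w in A[j] if v in adj[w]}


def make_smooth(adj, A, lam_k):
    """One pruning pass: drop v from A[i] if |N_v(j)| < |A[j]| * (1 - 2*k*lam_k) for some j."""
    k = len(A)
    thr = 1 - 2 * k * lam_k
    pruned = []
    for i in range(k):
        keep = {
            v for v in A[i]
            if all(len(neighbors_in(adj, v, i, j, A)) >= len(A[j]) * thr
                   for j in range(k) if j != i)
        }
        pruned.append(keep)
    return pruned


# Illustrative tournament: {0, 1, 2} mostly beats {3, 4, 5}, except that vertex 2 loses to all of them.
adj = {0: {1, 2, 3, 4, 5}, 1: {2, 3, 4, 5}, 2: set(), 3: {2, 4, 5}, 4: {2, 5}, 5: {2}}
print(make_smooth(adj, [{0, 1, 2}, {3, 4, 5}], lam_k=0.1))
```

All thresholds are evaluated against the original sets before anything is removed, which mirrors the single update step in the pseudocode; in the example only vertex 2 is discarded.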

If the order of the given $P_k$-free tournament is too small, the algorithm FindTrans (Algorithm 2)
returns a trivial answer (and it is easy to see that this gives good asymptotics on the coefficient $\epsilon$).


Input: k > 0 and $P_k$-free tournament T;

Output: transitive subset in V(T) of order $|T|^{c/(k\log(k)^2)}$;

begin

    if |T| = 1 then
        output V(T);
    end

    let $c_k$ = […], where: $\lambda$ = […];
    if 1 < |T| < […] then
        output any 2-element subset of V(T);
    end

    run CreateSequence(k, T) to obtain an a-sequence $\theta$ of length k;
    run MakeSmooth(k, $\theta$) to obtain a smooth a-sequence $(A_1, ..., A_k)$;
    initialize: $\theta_s \leftarrow (A_1, ..., A_k)$;
    let $\theta_s(i)$ denote the ith element of $\theta_s$;
    for $i = 1, ..., \frac{k}{2}$ do
        let u = left(i);
        let v = right(i);

        if there exists an edge e = (y, x) from $\theta_s(v)$ to $\theta_s(u)$ then
            let $A'_v \leftarrow \theta_s(v)$, $A'_u \leftarrow \theta_s(u)$ and
            $A'_t \leftarrow \theta_s(t) \cap N_y^{\theta_s}(t) \cap N_x^{\theta_s}(t)$ for $t \in \{1, ..., k\} \setminus \{u, v\}$;
            update: $\theta_s \leftarrow (A'_1, ..., A'_k)$;
        else
            run FindTrans(k, $T|A_u$) to obtain a transitive subset $M_1$;
            run FindTrans(k, $T|A_v$) to obtain a transitive subset $M_2$;
            output $M_1 \cup M_2$;
        end

    end

end

Algorithm 2: Algorithm FindTrans

Otherwise, the algorithm uses two subprocedures: CreateSequence, which constructs in the $P_k$-free
tournament T an a-sequence of length k, and MakeSmooth, which uses that sequence to construct a
smooth a-sequence of the same length. The CreateSequence procedure does not rely on the structural
properties of $P_k$-free tournaments, but just uses the fact that the tournament it operates on is H-free
for some k-vertex forbidden pattern H. We will discuss it in detail later. The MakeSmooth
procedure (Algorithm 1) is a standard method for removing from the given a-sequence those vertices
that have far fewer in/out-neighbors in some element of the a-sequence than the density
condition would suggest.


Input: r > 1 and $P_k$-free n-vertex tournament T;

Output: an a-sequence of length r in T;

begin

    let $V(P_k) = \{h_1, ..., h_k\}$;

    partition V(T) arbitrarily into k equal-size sets: $S_1, ..., S_k$;

    run MakeDensePair($\{S_1, ..., S_k\}$, $P_k$, n) to get (X, Y), where $X, Y \subseteq V(T)$;

    if r = 2 then
        output (X, Y);
    end

    initialize: $\mathcal{L} \leftarrow \emptyset$, $\mathcal{R} \leftarrow \emptyset$;

    let $s_1 = |X|$ and $s_2 = |Y|$;

    while $|X| \ge \frac{s_1}{2}$ do
        run CreateSequence($\frac{r}{2}$, T|X) to obtain an a-sequence L of length $\frac{r}{2}$;
        update: $X \leftarrow X \setminus V(L)$, $\mathcal{L} \leftarrow \mathcal{L} \cup \{L\}$;
    end

    while $|Y| \ge \frac{s_2}{2}$ do
        run CreateSequence($\frac{r}{2}$, T|Y) to obtain an a-sequence R of length $\frac{r}{2}$;
        update: $Y \leftarrow Y \setminus V(R)$, $\mathcal{R} \leftarrow \mathcal{R} \cup \{R\}$;
    end

    let $\lambda$ = […], $c_r$ = […] and m = […], where c = […];

    if there exist $L \in \mathcal{L}$, $R \in \mathcal{R}$ such that $d(V(L), V(R)) \ge 1 - 4\lambda$ then
        output $tr(L \otimes R, m)$;
    else
        output $\theta_1 \otimes \theta_2$, for arbitrary $\theta_1 \in \mathcal{L}$ and $\theta_2 \in \mathcal{R}$;
    end

end

Algorithm 3: Algorithm CreateSequence

Let us assume now that a smooth a-sequence of length k is given. The algorithm FindTrans tries
to reconstruct a directed path $P_k$ by looking for its ith vertex in the matching ordering of $P_k$ in the
ith element of the a-sequence $\theta_s$. This is conducted backward edge by backward edge in the matching
ordering of $P_k$. If the backward edge is not found, then two linear-size subsets $A_u, A_v$ of two distinct
elements of the original a-sequence such that $A_u$ is complete to $A_v$ are detected. Otherwise the
a-sequence is updated. The update is done in such a way that if in the new a-sequence the other
backward edges of the matching ordering of $P_k$ are found, then they can be combined with the edges
that were already found to reconstruct a copy of $P_k$. Since the tournament T the algorithm is working
on is $P_k$-free, at some point of its execution two subsets $A_u, A_v$ mentioned above with $d(A_u, A_v) = 1$
will be detected. When that happens, the algorithm is run recursively on the tournaments $T|A_u$ and
$T|A_v$, and the two transitive subsets found in these two recursive runs are then merged (the union is
transitive because $A_u$ is complete to $A_v$).
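
The main loop of FindTrans can be sketched in Python as follows. This is a simplified illustration, not the full algorithm: it omits the recursion and the size thresholds, uses 0-based positions, and assumes the tournament is a dictionary `adj` of outneighbor sets. The names `restrict` and `match_backward_edges` are hypothetical.

```python
def restrict(adj, w, i, t, A_t):
    """Vertices of A_t compatible with w from the i-th set: outneighbors if t > i, inneighbors if t < i."""
    if t > i:
        return {z for z in A_t if z in adj[w]}
    return {z for z in A_t if w in adj[z]}


def match_backward_edges(adj, A, pairs):
    """Look for one backward edge per pair (u, v) with u < v, shrinking the other sets as edges are found.

    Returns ("complete", u, v, A) if no edge goes from A[v] to A[u], i.e. A[u] is complete to A[v];
    otherwise returns ("path", edges, A) once every pair has received an edge.
    """
    A = [set(S) for S in A]
    found = []
    for u, v in pairs:                       # u = left(i), v = right(i)
        edge = next(((y, x) for y in A[v] for x in A[u] if x in adj[y]), None)
        if edge is None:
            return ("complete", u, v, A)
        y, x = edge
        found.append(edge)
        for t in range(len(A)):
            if t not in (u, v):
                A[t] = A[t] & restrict(adj, y, v, t, A[t]) & restrict(adj, x, u, t, A[t])
    return ("path", found, A)


# P_4 in its matching ordering (v_1, v_3, v_2, v_4), one vertex per set.
p4 = {1: {3, 4}, 2: {1, 4}, 3: {2}, 4: {3}}
print(match_backward_edges(p4, [{1}, {3}, {2}, {4}], [(0, 2), (1, 3)])[:2])
```

On a $P_k$-free input the "path" outcome cannot occur for a full set of k/2 pairs, so the loop must end with a "complete" pair, which is exactly where the recursion takes over.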
Let us now discuss the subprocedure CreateSequence (Algorithm 3) that constructs an a-sequence
of a specified length r (without loss of generality we will assume that $r = 2^w$ for some w > 0). As
mentioned before, the procedure can be applied for any forbidden pattern, not only $P_k$. Its main
ingredient is called MakeDensePair and is responsible for constructing two disjoint linear-size sets X, Y
in the $P_k$-free tournament such that the directed density d(X,Y) is very close to one.


Input: a set $\{S_{i_1}, ..., S_{i_p}\}$ such that $S_{i_1} \cup ... \cup S_{i_p}$ induces a $P_k$-free tournament, a p-vertex
tournament H with $V(H) = \{h_{i_1}, ..., h_{i_p}\}$ and parameter n;

Output: a pair of disjoint sets (X, Y);

begin

    let $\lambda$ = […] and m = […];

    for each $v \in S_{i_1}$ and $j \in \{i_2, ..., i_p\}$ let $N(v, S_j)$ be:
        the set of outneighbors of v from $S_j$ if $(h_{i_1}, h_j)$ is an edge and:
        the set of inneighbors of v from $S_j$ otherwise;
    let bad(v) be: an arbitrary $j \in \{i_2, ..., i_p\}$ such that $|N(v, S_j)| < \lambda|S_j|$ or
        0 if such a j does not exist;
    if there exists $v_0 \in S_{i_1}$ such that $bad(v_0) = 0$ then
        update: $S_j \leftarrow tr(N(v_0, S_j), \lambda|S_j|)$ for $j \in \{i_2, ..., i_p\}$;
        let $S_{new} \leftarrow \{S_j : j \in \{i_2, ..., i_p\}\}$;
        run MakeDensePair($S_{new}$, $H \setminus \{h_{i_1}\}$, n);
    else
        let $P_j = \{v \in S_{i_1} : bad(v) = j\}$ for $j \in \{i_2, ..., i_p\}$;
        let $j_0 = \arg\max_{j \in \{i_2, ..., i_p\}} |P_j|$ and $P_{j_0} \leftarrow tr(P_{j_0}, m)$;
        let $\{W_1, W_2, ...\}$ be a partitioning of $S_{j_0}$ into sets of size m;
        if $d(P_{j_0}, S_{j_0}) \ge \frac{1}{2}$ then
            output $(P_{j_0}, W_{l_{max}})$, where $l_{max} = \arg\max_l d(P_{j_0}, W_l)$;
        else
            output $(W_{l_{max}}, P_{j_0})$, where $l_{max} = \arg\max_l d(W_l, P_{j_0})$;
        end

    end

end

Algorithm 4: Algorithm MakeDensePair

The procedure CreateSequence acts as follows. First, two linear-size sets X, Y of the $P_k$-free tournament
with $d(X,Y) \ge 1 - \lambda$ for some $0 < \lambda \ll 1$ are found with the use of the procedure
MakeDensePair. If r = 2 then (X,Y) is output and the procedure ends. Otherwise, in both X
and Y a-sequences of length $\frac{r}{2}$ are constructed recursively. When a sequence is constructed,
it is deleted from X or Y and a new sequence is constructed in the remaining set. This is
repeated as long as at least half of the vertices are left in X or Y. Let $X_1, X_2, ...$ denote the sets of
the vertices of the a-sequences constructed in X and let $Y_1, Y_2, ...$ denote the sets of the vertices of the
a-sequences constructed in Y. The algorithm looks for sets $X_i, Y_j$ such that $d(X_i, Y_j) \ge 1 - 4\lambda$.
The way the sets X, Y were constructed by MakeDensePair as well as simple density arguments (see the
analysis of the algorithm) imply that such sets do exist. Thus, even though in the formal description
of CreateSequence we assume that the sets may not be found (and then two arbitrary sets $X_i, Y_j$
are taken), this in fact will never happen. The a-sequence of length r is output simply by combining
the two a-sequences of length $\frac{r}{2}$ corresponding to $X_i$ and $Y_j$.
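
The final combination step of CreateSequence can be sketched as follows. The sketch assumes that each a-sequence is a list of vertex sets, that the tournament is a dictionary `adj` of outneighbor sets, and that `lam` is supplied by the caller; truncation to a common size is omitted. The name `find_dense_pair` is hypothetical.

```python
def find_dense_pair(adj, left_seqs, right_seqs, lam):
    """Return the concatenation of some L and R with d(V(L), V(R)) >= 1 - 4*lam, or None."""
    for L in left_seqs:
        VL = set().union(*L)
        for R in right_seqs:
            VR = set().union(*R)
            e = sum(1 for x in VL for y in VR if y in adj[x])   # edges from V(L) to V(R)
            if e >= (1 - 4 * lam) * len(VL) * len(VR):
                return list(L) + list(R)
    return None


# Transitive tournament on 0..3 (illustrative assumption): i -> j whenever i < j.
adj = {i: {j for j in range(4) if j > i} for i in range(4)}
print(find_dense_pair(adj, [[{0}, {1}]], [[{2}, {3}]], lam=0.1))
```

Returning `None` corresponds to the fallback branch of the pseudocode, which according to the analysis is never taken.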

It remains to explain how the procedure MakeDensePair works (Algorithm 4). The procedure is
given a collection of sets $S_{i_1}, ..., S_{i_p} \subseteq V(T)$ of linear size each, for some $1 \le p \le k$, a
p-vertex tournament H with $V(H) = \{h_{i_1}, ..., h_{i_p}\}$, and a parameter n. Parameter n is the remembered
size of the tournament which is the input of the CreateSequence procedure initializing the recursive runs
of MakeDensePair.

Notice that $T|(S_{i_1} \cup ... \cup S_{i_p})$ is H-free. The procedure tries to reconstruct H in
$T|(S_{i_1} \cup ... \cup S_{i_p})$ in such a way that $h_{i_j}$ is found in $S_{i_j}$. It first verifies whether a
good candidate for $h_{i_1}$ exists in $S_{i_1}$. A good candidate should have a substantial number of
outneighbors in each $S_{i_j}$ such that $(h_{i_1}, h_{i_j})$ is an edge in H and a substantial number of
inneighbors in each $S_{i_j}$ such that $(h_{i_j}, h_{i_1})$ is an edge in H. If such a vertex v in $S_{i_1}$ is
found then the remaining sets are modified accordingly and the algorithm tries to reconstruct
$H \setminus \{h_{i_1}\}$ in their modified versions. This is done by a recursive run of the procedure on the
collection of modified sets $S_{i_2}, ..., S_{i_p}$. Since the tournament that the procedure operates
on is H-free, at some recursive run no good candidate will be found. As we will see in the theoretical
analysis, this implies (by the pigeonhole principle) the existence of two linear-size sets X, Y with density
d(X, Y) close to one. These sets are output by the procedure.
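
The "good candidate" test can be sketched as follows. This is an illustration only: `sets` maps an index j to the set $S_j$, `wants_out[j]` is `True` when $(h_{i_1}, h_j)$ is an edge of H, `lam` is a caller-supplied threshold, and `None` plays the role of the value 0 used in the pseudocode. The name `bad_index` is hypothetical.

```python
def bad_index(adj, v, sets, wants_out, lam):
    """Return some j with |N(v, S_j)| < lam * |S_j|, or None if v is a good candidate."""
    for j, S in sets.items():
        if wants_out[j]:
            count = sum(1 for w in S if w in adj[v])   # outneighbors of v in S_j
        else:
            count = sum(1 for w in S if v in adj[w])   # inneighbors of v in S_j
        if count < lam * len(S):
            return j
    return None


# Transitive tournament on 0..4 (illustrative assumption): i -> j whenever i < j.
adj = {i: {j for j in range(5) if j > i} for i in range(5)}
sets = {1: {0, 1}, 2: {3, 4}}
print(bad_index(adj, 2, sets, {1: True, 2: True}, lam=0.5))
print(bad_index(adj, 2, sets, {1: False, 2: True}, lam=0.5))
```

When every vertex of the first set has a bad index, grouping the vertices by that index and taking the largest group gives the set $P_{j_0}$ used in the second branch of the procedure.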


Input: k > 0 and $P_k$-free tournament T;

Output: an acyclic coloring of T using $|T|^{1-c/(k\log(k)^2)}$ colors;

begin

    initialize: $G \leftarrow T$, $\mathcal{V} \leftarrow \emptyset$;
    while $V(G) \neq \emptyset$ do
        run FindTrans(k, G) to obtain a transitive set M in G;
        update: $\mathcal{V} \leftarrow \mathcal{V} \cup \{M\}$, $G \leftarrow G \setminus M$;
    end

    color each set of $\mathcal{V}$ with a different color and output this coloring;

end

Algorithm 5: Algorithm AcyclicColoring

The use of the parameter n enables us to output two sets of the same desired size. This balancedness
will play an important role in the theoretical analysis of the procedure CreateSequence, which uses
MakeDensePair.

4.2 Algorithm AcyclicColoring

The acyclic coloring algorithm (Algorithm 5) is a simple wrapper for the FindTrans procedure.
It runs this procedure several times to obtain a partitioning of the $P_k$-free tournament T into
transitive sets. Each transitive set gets its own color, and the resulting coloring is the acyclic
coloring that is output.
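
The wrapper structure can be sketched in a few lines of Python. Note the assumption: FindTrans itself is not implemented here; a simple greedy routine `greedy_transitive` stands in for it and carries none of the size guarantees of the paper. The tournament is a dictionary `adj` of outneighbor sets, and all function names are hypothetical.

```python
def is_transitive(adj, S):
    """A tournament is transitive iff its out-degrees inside S are exactly 0, 1, ..., |S|-1."""
    S = set(S)
    return sorted(len(adj[x] & S) for x in S) == list(range(len(S)))


def greedy_transitive(adj, vertices):
    """Stand-in for FindTrans (an assumption): grows a transitive set greedily, no size guarantee."""
    chosen = set()
    for v in sorted(vertices):
        if is_transitive(adj, chosen | {v}):
            chosen.add(v)
    return chosen


def acyclic_coloring(adj, find_trans=greedy_transitive):
    """Repeatedly peel off a transitive set; each peeled set becomes one color class."""
    remaining = set(adj)
    classes = []
    while remaining:
        M = find_trans(adj, remaining)
        classes.append(M)
        remaining -= M
    return classes


# Cyclic triangle 0 -> 1 -> 2 -> 0 needs two colors.
print(acyclic_coloring({0: {1}, 1: {2}, 2: {0}}))
```

Any routine that returns a nonempty transitive subset of the remaining vertices can be plugged in as `find_trans`; the number of colors is then governed by how large those subsets are.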
5 Analysis

5.1 Introduction

To show that the presented algorithms are correct we need to prove Theorem 3.1 and Theorem 3.2. Let
us assume first that Theorem 3.1 is true. Under this assumption it is easy to prove Theorem 3.2.

Proof. Let $\epsilon = \frac{c}{k\log(k)^2}$, where c and k are as in Theorem 3.1. Algorithm 5 keeps finding transitive
subtournaments of order at least $(\frac{n}{2})^{\epsilon}$ as long as there are at least $\frac{n}{2}$ vertices left in the
tournament. By the time the algorithm reaches the state with fewer than $\frac{n}{2}$ vertices remaining, at most
$O(n^{1-\epsilon})$ transitive subtournaments have been found. Then the algorithm is run on the remaining graph
of fewer than $\frac{n}{2}$ vertices. The algorithm stops when there are no vertices left. When this happens all the
vertices of the tournament are partitioned into transitive subsets. If we denote by H(n) the total
number of transitive subtournaments found, then we have the following simple recurrence formula:
$H(n) \le O(n^{1-\epsilon}) + H(\frac{n}{2})$, which immediately gives us $H(n) = O(n^{1-\epsilon})$. Thus we obtain the
desired approximation of the acyclic coloring problem. Since finding each transitive subset takes quadratic
time and at most $O(n^{1-\epsilon})$ transitive subsets are constructed, the total running time of the coloring
algorithm is as stated in Theorem 3.2. □
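
For completeness, the recurrence can be unrolled explicitly. In the sketch below $K$ is a name introduced here for the constant hidden in the $O(n^{1-\epsilon})$ term, and $\epsilon$ is assumed to be bounded away from 1:

```latex
H(n) \;\le\; K n^{1-\epsilon} + H\!\left(\tfrac{n}{2}\right)
     \;\le\; K n^{1-\epsilon} \sum_{i \ge 0} 2^{-i(1-\epsilon)}
     \;=\; \frac{K\, n^{1-\epsilon}}{1 - 2^{-(1-\epsilon)}}
     \;=\; O\!\left(n^{1-\epsilon}\right),
\qquad
\text{total time} \;=\; O\!\left(n^{1-\epsilon}\right) \cdot O\!\left(n^{2}\right) \;=\; O\!\left(n^{3-\epsilon}\right).
```

The geometric series converges because each halving of n shrinks the term $n^{1-\epsilon}$ by the fixed factor $2^{-(1-\epsilon)} < 1$.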

Theorem 3.1 is a result of a series of lemmas:

Lemma 5.1. Let $\lambda$ = […]. A run of Algorithm MakeDensePair from CreateSequence outputs two disjoint
subsets X, Y of the given n-vertex tournament such that $d(X,Y) \ge 1 - \lambda$ and $|X| = |Y| = cn$, where
c = […], provided that n > k.

The next lemma gives us the parameters of the a-sequence constructed by the procedure CreateSequence.

Lemma 5.2. Let $\lambda$ = […]. If n > […] then Algorithm CreateSequence constructs a $(c_r, \lambda_r)$-a-sequence of
length r in the given n-vertex tournament, where $c_r$ = c · […], c = […], $\lambda_r = 4\lambda r^2$ for r > 2
and $\lambda_2 = \lambda$. Furthermore, each element of the constructed a-sequence is of the same size $c_r n$.

The parameters of the smooth a-sequence produced by MakeSmooth from the input a-sequence
are given in the next lemma:

Lemma 5.3. Let $\lambda$ = […]. Assume that the input to Algorithm MakeSmooth is a $(c_k, \lambda_k)$-a-sequence of
length k for some $c_k > 0$ and $\lambda_k = 4\lambda k^2$. Then Algorithm MakeSmooth from the FindTrans procedure
constructs a smooth $(\frac{c_k}{2}, \lambda_f)$-a-sequence, where $\lambda_f = 4k\lambda_k$.

The proof of Theorem 3.1 as well as the proofs of the above lemmas are given in the next
subsection.

5.2 Proof of Theorem 3.1

We start with the following simple lemma. 

Lemma 5.4. Let T be a tournament. Assume that for two disjoint subsets $X, Y \subseteq V(T)$ the following holds:
$d(X, Y) \ge 1 - \lambda$ for some $\lambda < 1$. Assume that $X_1 \subseteq X$, $Y_1 \subseteq Y$, $|X_1| \ge c_1|X|$,
$|Y_1| \ge c_2|Y|$ for some $0 < c_1, c_2 \le 1$. Then $d(X_1, Y_1) \ge 1 - \frac{\lambda}{c_1 c_2}$.

Proof. Let $e_{Y,X}$ be the number of directed edges from Y to X and let $e_{Y_1,X_1}$ be the number of
directed edges from $Y_1$ to $X_1$. We have:

$e_{Y,X} = (1 - d(X,Y))|X||Y| \le \lambda|X||Y|$,

since $d(X,Y) \ge 1 - \lambda$. Similarly: $e_{Y_1,X_1} = (1 - d(X_1, Y_1))|X_1||Y_1|$. Assume by contradiction that
$d(X_1, Y_1) < 1 - \frac{\lambda}{c_1 c_2}$. Then, since $|X_1| \ge c_1|X|$ and $|Y_1| \ge c_2|Y|$, we have:
$e_{Y_1,X_1} > \lambda|X||Y|$. Since $e_{Y,X} \ge e_{Y_1,X_1}$, we get: $e_{Y,X} > \lambda|X||Y|$, a contradiction. □
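
The density lemma is easy to sanity-check numerically. The following sketch is not part of the proof; it assumes the bound has the form $d(X_1, Y_1) \ge 1 - \lambda/(c_1 c_2)$, generates random orientations between two vertex sets, sets $\lambda = 1 - d(X,Y)$, and checks the bound for random subsets. All names and parameter values are chosen here for illustration.

```python
import random


def density(edges, X, Y):
    """Fraction of pairs (x, y) with x in X, y in Y that are oriented from x to y."""
    return sum((x, y) in edges for x in X for y in Y) / (len(X) * len(Y))


def check_lemma(seed=0, size=40, sub=10, p=0.95):
    rng = random.Random(seed)
    X = list(range(size))
    Y = list(range(size, 2 * size))
    # Orient each X-Y pair from X to Y with probability p, otherwise from Y to X.
    edges = {(x, y) if rng.random() < p else (y, x) for x in X for y in Y}
    lam = 1 - density(edges, X, Y)
    X1, Y1 = rng.sample(X, sub), rng.sample(Y, sub)
    c1 = c2 = sub / size
    # Small tolerance guards against floating-point rounding only.
    return density(edges, X1, Y1) >= 1 - lam / (c1 * c2) - 1e-12


print(all(check_lemma(seed=s) for s in range(20)))
```

The check can never fail for a correct bound, because the backward edges inside $X_1 \times Y_1$ are a subset of those inside $X \times Y$; the script merely makes that counting argument tangible.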

Let us assume that lemmas: 15.21 and 15.31 from the main body of the paper are correct. We will 
first show how Theorem 13.11 is implied by them. Then we will prove all three lemmas (Lemma 15.11 
will be used to prove Lemma 15.21) . The proof of Theorem 13.11 is given below. 


Proof. 

We will proceed by induction on the size of the P*.-free tournament T. Let e = k > where 
C > 0 is a small enough universal constant. Let us consider first the case when |T| < where 

Ck is as in Algorithm FindTrans. In this setting \T\ is of the order k c fclog ( fc ) for some universal 
constant C > 0 so the output of the algorithm trivially satisfies conditions of Theorem 13.11 Now 
let us consider more interesting case when |T| > yfo Notice then that the requirement from Lemma 
15.21 regarding the size of the input n-vertex P^-free tournament is trivially satisfied. Assuming that 
lemmas: 15.21 and are true, we conclude that initially the a-sequence 6 S from FindTrans is a 
smooth (t^, A/)-a-sequence, where: Ck = c ■ (|) log ( fc ) _1 , A/ = 4/cAfc, A k = 4A&: 2 , c = p and A = yyp. 
Now consider the for-loop in the algorithm. Notice that it cannot be the case that in each run of the 
loop an edge e = (y, x) is found. Indeed, assume otherwise and denote the set of edges found in all 
| runs by {(yi,xi),..., (yk,Xk)}. Denote by <r(yj) this j that satisfies: y* G Aj. Similarly, denote by 


(i(xi) this j that satisfies: Xi € Aj. Notice that the vertices x±,yi, ■■■,Xk,yk induce a copy of Pk and 

J 2 2 

besides the ordering of {xi,y\,Xk , yk } induced by a is a matching ordering under which the set of 

2 2 

backward edges is exactly: {(yi, x\), ..., (yk,Xk)}. This is a straightforward conclusion from the way 

2 2 

the a-sequence 0 S is updated. That however contradicts the fact that the tournament the algorithm 
operates on is P/,.-free. Thus we can assume that in some run of the main for-loop the algorithm 
recursively runs itself on T\A U and T|A„ for some A U ,A V from the given a-sequence. Notice that 
whenever a backward edge $(y, x)$ is found the size of each $A_i$ in the updated a-sequence decreases by
at most 2 • ^-nXf. Thus at every stage of the execution of the algorithm each $A_j$ is of order at least
yn — k ■ p n\f which is at least yu (since $\lambda_f = 4k\lambda_k$ < p). Therefore when two recursive runs of
the procedure FindTrans are conducted, each run operates on a tournament of size at least yn.
By induction, a transitive tournament of order at least 2(^n) e is produced. It remains to prove that
under our choice of e (for $C > 0$ small enough) we have: 2(%n) e > n e , i.e e < We leave it to
the reader.

Let us comment now on the running time of the algorithm. First notice that procedure MakeDensePair
runs in quadratic time. Throughout its execution it calls itself at most $k$ times and
the time it takes between any two recursive calls is clearly at most quadratic. This in particular
implies that procedure CreateSequence also runs in quadratic time. Indeed, throughout its execution
at most $O(k)$ calls of the procedure MakeDensePair are conducted and its other operations take
altogether at most quadratic time. Furthermore, Algorithm MakeSmooth is clearly quadratic and,
moreover, a naive implementation of each run of the for-loop in the procedure FindTrans takes at most
quadratic time. Thus Algorithm FindTrans has quadratic running time.

□
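The matching-ordering argument in the proof above can be illustrated on a toy instance. The following Python sketch is not part of Algorithm FindTrans; the helper names `backward_edges` and `is_matching` and the 4-vertex example tournament are hypothetical, introduced here only to show what it means for the backward edges under a vertex ordering to form a matching.

```python
from itertools import combinations


def backward_edges(edges, order):
    """Return the edges (u, v) that point from a later vertex to an earlier one in `order`."""
    pos = {v: i for i, v in enumerate(order)}
    return [(u, v) for (u, v) in edges if pos[u] > pos[v]]


def is_matching(edge_list):
    """True if no vertex is an endpoint of more than one edge in `edge_list`."""
    seen = set()
    for u, v in edge_list:
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True


# Hypothetical example: ordering x1, y1, x2, y2 in which every edge points
# forward except the two backward edges (y1, x1) and (y2, x2).
order = ["x1", "y1", "x2", "y2"]
back = {("y1", "x1"), ("y2", "x2")}
edges = set()
for a, b in combinations(order, 2):  # a precedes b in `order`
    edges.add((b, a) if (b, a) in back else (a, b))

found = backward_edges(edges, order)
print(sorted(found), is_matching(found))
```

In this toy ordering the backward edges are pairwise vertex-disjoint, which is the property the proof relies on when it assembles the edges $(y_i, x_i)$ found in consecutive runs of the loop.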


It remains to prove Lemmas 5.1, 5.2 and 5.3. We start with Lemma 5.3.
Proof. Let $\theta = (A_1, \ldots, A_k)$ be the input a-sequence. By Lemma 5.4 we get: $|C_{i,j}| \le \frac{|A_i|}{2k}$. Thus for
any $i = 1, \ldots, k$ we have: $|\bigcup_j C_{i,j}| \le \frac{|A_i|}{2}$. This implies in particular that each updated $A_i$ is of size
at least half the size of the original one. Now take some $1 \le i < j \le k$ and a vertex $v \in A_i^n$, where
$A_i^n$ is the new version of $A_i$ after the update. By the definition of $A_i^n$ we know that $v$ has at most
$2k\lambda_k|A_j|$ inneighbors from $A_j$. Denote by $A_j^n$ the new version of $A_j$ after the update. Then we can
conclude that $v$ has at most $4k\lambda_k|A_j^n|$ inneighbors from $A_j^n$. Similar analysis can be conducted for
$1 \le j < i \le k$. That completes the proof. □


Now we prove Lemma 5.2 assuming that Lemma 5.1 is true.

Proof. We proceed by induction on $r$. For $r = 2$ Algorithm CreateSequence is reduced to procedure
MakeDensePair, thus the result follows by Lemma 5.1. Let us assume now that $r > 2$. Then,
by induction and Lemma 5.1, each element of each a-sequence $L$ is of size at least ci. • ^ and at
most cr • cn. Similarly, each element of each a-sequence $R$ is of size at least a_ • ™ and at most
cr • cn. In particular, the size of each element of an arbitrary $L \in \mathcal{L}$ is at most twice the size
of each element of an arbitrary $R \in \mathcal{R}$ and vice versa: the size of each element of an arbitrary
$R \in \mathcal{R}$ is at most twice the size of each element of an arbitrary $L \in \mathcal{L}$. By Lemma 5.1 the
directed density between initial sets $X$ and $Y$ is at least $1 - \lambda$. Denote $X_1 = \bigcup_{L \in \mathcal{L}} V(L)$ and $Y_1 =
\bigcup_{R \in \mathcal{R}} V(R)$, where $\mathcal{L}$ and $\mathcal{R}$ are taken when both while-loops in the algorithm are completed. We
trivially have: $|X_1| \ge \frac{|X|}{2}$ and $|Y_1| \ge \frac{|Y|}{2}$. Thus by Lemma 5.4 we obtain: $d(X_1, Y_1) \ge 1 - 4\lambda$.


Notice that $d(X_1, Y_1) = \frac{\sum_{L \in \mathcal{L}, R \in \mathcal{R}} d(V(L), V(R)) |V(L)| |V(R)|}{|X_1| |Y_1|}$. Let us assume first that there do not exist
$L \in \mathcal{L}$, $R \in \mathcal{R}$ such that $d(V(L), V(R)) \ge 1 - 4\lambda$. But then, by the above observation, we have:
$d(X_1, Y_1) < \frac{\sum_{L \in \mathcal{L}, R \in \mathcal{R}} (1 - 4\lambda) |V(L)| |V(R)|}{|X_1| |Y_1|}$. Thus $d(X_1, Y_1) < (1 - 4\lambda) \frac{\sum_{L \in \mathcal{L}, R \in \mathcal{R}} |V(L)| |V(R)|}{|X_1| |Y_1|} = 1 - 4\lambda$,
a contradiction. Therefore a-sequences $L_0, R_0$ such that $d(V(L_0), V(R_0)) \ge 1 - 4\lambda$ will be found.
Notice that, by induction and Lemma 5.1, all elements of $L_0$ are of the same size. Similarly, all
elements of $R_0$ are of the same size. Thus, by our previous observations and Lemma 5.4, we can
conclude that in the truncated version of the $R_0$-part of the output a-sequence the density between
an element appearing earlier in the sequence and an element appearing later is at least $1 - 4\lambda_{r/2}$.
Similarly, the directed density between an element of the final output that is from the $L_0$-part of the
sequence and the one that is from the $R_0$-part of the sequence is at least $1 - 4\lambda \cdot 4(\frac{r}{2})^2$. This leads
us to the following recursive formula: $\lambda_r = \max(4\lambda_{r/2}, 4\lambda \cdot 4(\frac{r}{2})^2)$ for $r > 2$ and $\lambda_2 = \lambda$. One can
easily check that this recursion has a solution which is exactly of the form given in the statement of
Lemma 5.2. Furthermore, trivially each element of the output a-sequence is forced to be of order
^cl, which leads to the recursive formula on $c_r$ from the statement of Lemma 5.2. □
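The averaging step in the proof above (the density between two unions of disjoint parts is a size-weighted average of the densities between the parts, so some pair of parts is at least as dense as the whole) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $d(X, Y)$ to be the fraction of pairs $(x, y) \in X \times Y$ with an edge from $x$ to $y$, which is how the density appears to be used in this appendix, and the random tournament, the part sizes and all function names are hypothetical.

```python
import random
from fractions import Fraction


def random_tournament(n, seed=0):
    """Random tournament on vertices 0..n-1; beats[u][v] is True iff the edge goes u -> v."""
    rng = random.Random(seed)
    beats = [[False] * n for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                beats[u][v] = True
            else:
                beats[v][u] = True
    return beats


def density(beats, X, Y):
    """Assumed definition of d(X, Y): fraction of pairs (x, y) in X x Y with an edge x -> y."""
    hits = sum(1 for x in X for y in Y if beats[x][y])
    return Fraction(hits, len(X) * len(Y))


beats = random_tournament(24, seed=1)
# Hypothetical partition of X1 into three parts and of Y1 into two parts.
L_parts = [list(range(0, 4)), list(range(4, 8)), list(range(8, 12))]
R_parts = [list(range(12, 18)), list(range(18, 24))]
X1 = [v for part in L_parts for v in part]
Y1 = [v for part in R_parts for v in part]

# Size-weighted average of the pairwise densities between parts.
weighted = sum(
    density(beats, L, R) * len(L) * len(R) for L in L_parts for R in R_parts
) / (len(X1) * len(Y1))
best = max(density(beats, L, R) for L in L_parts for R in R_parts)

print(weighted == density(beats, X1, Y1), best >= density(beats, X1, Y1))
```

Exact rational arithmetic (`Fraction`) is used so that the identity can be compared with `==` rather than up to floating-point error.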


It remains to prove Lemma 5.1.

Proof. Notice first that the output sets $X$ and $Y$ are forced to be of the size given in the statement of
Lemma 5.1. Indeed, sets $P_{j_0}$ and $W_i$ are of size $m$ each which is exactly $cn$ for c = jj. The crucial
observation is that the longest path in the tree of recursive calls of the procedure MakeDensePair
is of length at most $k$. Assume otherwise and choose $k$ consecutive vertices $v_0$ constructed in $k$
consecutive recursive calls. Denote these vertices as: $v_0^1, \ldots, v_0^k$. Notice that from the way each $v_0^i$
is constructed we can immediately deduce that $\{v_0^1, \ldots, v_0^k\}$ induce a copy of $P_k$, a contradiction. So
after the procedure MakeDensePair is called for the first time by CreateSequence, it executes at most $k$
recursive calls. Now notice that the size of the set p. from the input of the procedure decreases
between its two consecutive recursive calls exactly by a factor of j. Thus when a set $P_{j_0}$ is found
the size of $S_{j_0}$ is where it < k is the number of recursive calls that were run. By the definition
of $P_{j_0}$ we have one of two possible options:

• every vertex of $P_{j_0}$ is adjacent to at least $(1 - \lambda)|S_{j_0}|$ vertices of $S_{j_0}$ or

• every vertex of $P_{j_0}$ is adjacent from at least $(1 - \lambda)|S_{j_0}|$ vertices of $S_{j_0}$.


In particular we have: $d(P_{j_0}, S_{j_0}) \ge 1 - \lambda$ or $d(P_{j_0}, S_{j_0}) \le \lambda$. Assume without loss of generality
that the former holds. Then by the same density argument as in the proof of Lemma 5.2 we can
conclude that $d(P_{j_0}, W_{i_{max}}) \ge 1 - \lambda$. Finally, notice that, as we have already mentioned at the very
beginning of the proof, both $P_{j_0}$ and $W_{i_{max}}$ are of the desired size $m$. That completes the proof. □

5.3 Infinite families of $P_k$-free tournaments with small transitive subsets

In this subsection we show that our results from the main body of the paper are tight up to the
$\log(k)$-factor in the following sense: there exists an infinite family of $P_k$-free tournaments with largest
transitive subsets of order $O(n^{\frac{c \log(k)}{k}})$. The presented construction is based on [6]. We need one more
definition. Let $S, F$ be two tournaments and denote $V(S) = \{s_1, \ldots, s_{|S|}\}$. We denote by $S \times F$ a
tournament $T$ with the vertex set $V(T) = V_1 \cup \ldots \cup V_{|S|}$, where each $V_i$ induces a copy of $F$ and for
any $1 \le i < j \le |S|$, $x \in V_i$, $y \in V_j$ we have the following: $x$ is adjacent to $y$ iff $s_i$ is adjacent to $s_j$ in
$S$.

Fix $k > 0$. Without loss of generality we can assume that $k > 4$. Notice first that there exists a
universal constant $c > 0$ and a tournament $B$ on $2^{ck}$ vertices with largest transitive subtournaments
of order $k$ and that is $P_k$-free. Such a tournament may be easily constructed randomly by fixing
$2^{ck}$ vertices and choosing the direction of each edge independently at random with probability $\frac{1}{2}$
(a standard probabilistic argument shows that most of the tournaments constructed according to this
procedure satisfy the condition regarding sizes of their transitive subsets and $P_k$-freeness).

Now we define the following infinite family $\mathcal{F}$ of tournaments:

• $F_0$ is a one-vertex tournament,

• $F_{i+1} = B \times F_i$ for $i = 0, 1, \ldots$.

5.5. Each tournament $F_i \in \mathcal{F}$ is $P_k$-free.

Proof. The proof is by induction on $i$. The induction base is trivial. Now let us assume that all $F_i$ for
$i \le i_0$ are $P_k$-free and let us take the tournament $F_{i_0+1}$. Denote the copies of $F_{i_0}$ that build $F_{i_0+1}$ as:
$T_1, \ldots, T_{|S|}$. Assume by contradiction that $P$ is a subtournament of $F_{i_0+1}$ that is isomorphic to $P_k$.
Notice first that $|V(P) \cap V(T_j)| < k$ for $j = 1, \ldots, |S|$. Indeed, that follows from the fact that clearly
every $T_j$ is $P_k$-free. Now observe that if $|V(P) \cap V(T_j)| > 0$ then in fact $|V(P) \cap V(T_j)| = 1$. Otherwise,
by the definition of $\mathcal{F}$ and from the previous observation we would conclude that $V(P) \cap V(T_j)$ is
a nontrivial homogeneous subset of $V(P)$, but this contradicts the fact that $P$ is prime. But then
we conclude that $P$ is a subtournament of $B$, which obviously contradicts the definition of $B$. That
completes the proof. □
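The role of homogeneous sets in the proof above can be seen on a small instance of the product $S \times F$. The sketch below is illustrative only: `substitute` and `is_homogeneous` are hypothetical helper names, tournaments are stored as boolean matrices with `T[u][v]` true iff $u$ is adjacent to $v$, and the cyclic triangle is used as a stand-in for both factors (it is not the tournament $B$ from the text).

```python
def substitute(S, F):
    """Build S x F: each vertex of S is replaced by a copy of F.

    Vertex (i, a) is numbered i * len(F) + a. Edges inside a block follow F,
    edges between blocks i != j follow the edge between s_i and s_j in S.
    """
    s, f = len(S), len(F)
    T = [[False] * (s * f) for _ in range(s * f)]
    for i in range(s):
        for a in range(f):
            for j in range(s):
                for b in range(f):
                    u, v = i * f + a, j * f + b
                    if u != v:
                        T[u][v] = F[a][b] if i == j else S[i][j]
    return T


def is_homogeneous(T, H):
    """True if every vertex outside H is adjacent to all of H or adjacent from all of H."""
    H = set(H)
    for w in range(len(T)):
        if w in H:
            continue
        if len({T[w][h] for h in H}) > 1:
            return False
    return True


# Cyclic triangle 0 -> 1 -> 2 -> 0, used here as both S and F.
C3 = [[False, True, False],
      [False, False, True],
      [True, False, False]]
T = substitute(C3, C3)
blocks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

print([is_homogeneous(T, blk) for blk in blocks], is_homogeneous(T, [0, 3]))
```

Every block $V_i$ of the product is homogeneous, which is why a prime subtournament can meet each block in at most one vertex; a set that mixes vertices from two blocks, such as `[0, 3]` here, need not be homogeneous.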

Now notice that the size of $F_{i+1}$ is exactly $|B|$ times the size of $F_i$ and the size of the largest
transitive subtournament of $F_{i+1}$ is exactly $tr(B)$ times the size of the largest transitive subtournament
of $F_i$ for $i = 0, 1, \ldots$, where $tr(B)$ stands for the size of the largest transitive subset of $B$. That
immediately leads to the conclusion that the size of the largest transitive subtournament of $F_i$ is of
order $|F_i|^{\frac{\log(tr(B))}{\log(|B|)}}$. The last expression, by the definition of $B$, is of order $|F_i|^{\frac{c \log(k)}{k}}$. Therefore $\mathcal{F}$ is
the family we were looking for.
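The multiplicative behaviour of the sizes can be verified by brute force on a tiny example. The following sketch makes several assumptions that are not in the text: the cyclic triangle $C_3$ (with $tr(C_3) = 2$) is used in place of $B$, the helper names are hypothetical, and the exhaustive search over vertex subsets is only feasible for very small tournaments.

```python
import math
from itertools import combinations


def substitute(S, F):
    """Build S x F: blocks are copies of F, edges between blocks follow S."""
    s, f = len(S), len(F)
    T = [[False] * (s * f) for _ in range(s * f)]
    for i in range(s):
        for a in range(f):
            for j in range(s):
                for b in range(f):
                    u, v = i * f + a, j * f + b
                    if u != v:
                        T[u][v] = F[a][b] if i == j else S[i][j]
    return T


def is_transitive(T, verts):
    """A tournament is transitive iff its sorted out-degrees are 0, 1, ..., m-1."""
    scores = sorted(sum(1 for v in verts if v != u and T[u][v]) for u in verts)
    return scores == list(range(len(verts)))


def largest_transitive(T):
    """Size of the largest transitive subtournament, by exhaustive search."""
    n = len(T)
    for size in range(n, 0, -1):
        if any(is_transitive(T, verts) for verts in combinations(range(n), size)):
            return size
    return 0


# Stand-in for B: the cyclic triangle 0 -> 1 -> 2 -> 0.
C3 = [[False, True, False],
      [False, False, True],
      [True, False, False]]
F1 = C3                  # B x F_0, where F_0 is a single vertex
F2 = substitute(C3, F1)  # B x F_1

tr1, tr2 = largest_transitive(F1), largest_transitive(F2)
exponent = math.log(tr2) / math.log(len(F2))
print(len(F1), tr1, len(F2), tr2, exponent)
```

In this toy family both the order and the largest transitive subset multiply at each step, so the exponent $\frac{\log(tr(F_i))}{\log(|F_i|)}$ stays at $\frac{\log 2}{\log 3}$, mirroring the role of $\frac{\log(tr(B))}{\log(|B|)}$ in the argument above.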

6 Conclusions 

One can easily notice that our methods can be extended to larger classes of forbidden tournaments,
for instance tournaments with an ordering of vertices under which the graph of backward edges is
a matching. It would be interesting to characterize all classes of tournaments for which the presented
method (or its minor modifications) works. The approximation ratio of the proposed algorithm may
be much better in practice. This is another interesting direction that could be explored.